Abstract

In this paper we present conclusions of a study of irreversibility in physical processes, using
the conceptual formalism of the ε-machine. The causal irreversibility is examined, in particular
for the class of Flower processes, and found to be attributable to contributions from the transition
probability distributions as well as the causal state topology of a process. The topological
irreversibility is a prerequisite for causal irreversibility, and is examined in more detail in the
framework of topological ε-machines. Detailed study of the mechanisms involved in time reversal
of the particular model class of semi-periodic deterministic finite automata allows characterization
of the topological source of irreversibility in this context. The work was performed as part
of the 2013 UC Davis physics REU program.
1 Introduction to complexity sciences

Arguably  the  primary  goal  of  science  is  a  simple  one:  to  describe  the  behavior  of  physical  systems. 
The  breadth  of  this  endeavor  is  immense,  and  necessitates  the  development  of  conceptual  models 
which  replicate  only  those  features  of  a  system  of  interest  for  a  particular  purpose.  Accordingly, 
one’s  goals  are  crucial  to  devising  or  choosing  a  model,  and  the  utility  of  any  model  is  determined 
by  the  granularity  with  which  one  studies  a  system.  The  field  of  complexity  sciences  is  reflective 
of  this.  Studying  complex  systems  requires  determining  the  appropriate  level  of  abstraction,  at 
which  one  can  form  models  comprising  distinct  elements  that  collectively  exhibit  complex  emergent 
behavior.  This  framework  is  analogous  to  the  process  in  thermodynamics  by  which  individual 
particles  behaving  according  to  microscopic  force  laws  are  described  by  concepts  like  temperature 
and  pressure,  which  can  only  be  interpreted  on  a  macroscopic  scale.  Indeed,  the  notions  behind 
some  thermodynamic  concepts,  like  entropy,  can  be  applied  in  a  much  broader  context.  Accordingly, 
complexity  theory  seeks  to  describe  the  behavior  of  general  composite  systems  via  modeling  of  the 
constituents. 

Here  we  place  particular  emphasis  on  the  case  of  time  evolution  of  dynamical  systems.  We 
define  a  system  by  specifying  its  components  and  their  interactions,  identify  a  quantity  that  is 
representative  of  the  state  of  the  system  as  a  whole,  and  “measure”  it  as  the  system  evolves  over 
time,  although  in  many  cases  the  quantity  is  an  invariant.  The  reverse  can  also  be  done,  where  the 
measurement  results  are  available  and  the  structure  of  the  system  is  in  question.  In  this  case,  the 
ideal  result  is  one  that  represents  the  data  arising  from  a  system  with  as  few  prior  assumptions  as 
is  possible. 

1.1  Presentations  of  processes 

As  we  are  concerned  with  systems  which  are  evolving  in  time,  and  wish  to  make  a  comment  on  the 
nature  of  this  evolution,  the  concept  of  the  passage  of  time  is  important  in  motivating  a  particular 
construction.  The  following  definitions  and  notation  are  taken  from  references  [2],  [3],  and  [4]. 
Consider some system (henceforth, process $\mathcal{P}$) and an instrument which, with some sampling rate,
makes discrete observations of the process, yielding a possibly bi-infinite stream of data:

$$\ldots X_{-3} X_{-2} X_{-1} X_0 X_1 X_2 X_3 \ldots$$

Each of the $X_i$ is a random variable representing the data point recorded at time $t = t_i$. Assume
the $X_i$ to be discrete random variables which are in general dependent, each with probability
mass function $f_{X_i}$. We construct $\mathcal{X} = \bigcup_i \operatorname{supp}(f_{X_i})$, the union of the supports of these
mass functions. That is, $\mathcal{X}$ is the set of all possible measurement
outcomes, or the alphabet. We refer to any particular realization of measurement data as a sequence
of $x_i \in \mathcal{X}$.

A good presentation of this data is one which captures the “macroscopic” features of the process
in a more compact manner than the stream of data itself. We also desire predictive capacity
in the model. To that end, we interpret the process as a channel transmitting information
about the system’s history to its future states. We partition the sequence by defining the past
$\overleftarrow{X} = \ldots X_{-3} X_{-2} X_{-1}$ and future $\overrightarrow{X} = X_0 X_1 X_2 X_3 \ldots$. Any particular past
$\overleftarrow{x} = \ldots x_{-3} x_{-2} x_{-1}$ is
referred to as a history. Predictive capacity can now be more precisely defined as the ability to
define a probability distribution $\Pr(\overrightarrow{X} \mid \overleftarrow{x})$ over the space of all possible futures conditioned on
a given history $\overleftarrow{x}$.
1.2 ε-machines

To construct such a model of a structured process, we use the histories themselves. We use the
equivalence relation that, given our predictive model, $\overleftarrow{x} \sim \overleftarrow{x}'$ if
$\Pr(\overrightarrow{X} \mid \overleftarrow{x}) = \Pr(\overrightarrow{X} \mid \overleftarrow{x}')$. That is,
the equivalence classes are composed of those histories that result in identical predictions. These
equivalence classes are referred to as causal states, the set of which, $\mathcal{S}$, is a partitioning of the space
of histories. Causal states are optimally predictive; that is, by definition, knowledge of the causal
state of a process is as useful for prediction as knowledge of the entire history of the process.

Any process with a history $\overleftarrow{x} = \ldots x_{-3} x_{-2} x_{-1}$ can thus be said to be in a causal state $\mathcal{S}_0$. The
next symbol observed, $x_0$, when added to the history may cause a change in the causal state of the
system, taking $\mathcal{S}_0$ to $\mathcal{S}_1$. In this case we refer to a transition in the system between causal states
upon observation of the symbol. We can construct a set of transition matrices $\mathbf{T} = \{T^{(x)} : x \in \mathcal{X}\}$
consisting of one $|\mathcal{S}| \times |\mathcal{S}|$ transition matrix for each symbol that could be observed.
The element $T^{(x)}_{ij}$ of a particular matrix specifies the probability of a transition from causal state $i$
to state $j$ upon observing symbol $x$. Inherent in this description is an assumption that the set of
causal states is finite; otherwise the construction requires application of Zorn’s lemma. Here we deal only with finite $\mathcal{S}$. The
set of causal states together with the transition matrices of the process $\mathcal{P}$ comprise the ε-machine $M(\mathcal{P})$.

We assume a process to be stationary, giving us freedom in time indexing. We will define $X_0$
to coincide with the “present” (formally, the first symbol of the future), and, notationally, we let
the state $\mathcal{S}_0$ transition to $\mathcal{S}_1$ upon observation of $x_1$.

Because they carry the same information as a full process history, causal states are Markovian
[4]. However, an ε-machine is not a Markov chain because the states of the system themselves are
not observable; only the symbols are. This property corresponds to a hidden Markov model, or
HMM. It suggests a graphical representation, which is presented in figure 1 for dummy processes
illustrating the difference between the two Markov models.


Figure 1: Examples of a state-output Markov chain (a) and hidden Markov model (b) which have
the same alphabet $\mathcal{X} = \{A, B, C\}$

The symbol-independent transition matrix can be obtained by summing over the symbols:
$T = \sum_{x \in \mathcal{X}} T^{(x)}$.

We may apply $T$ to a $|\mathcal{S}|$-length row vector containing the probability of finding the system in
each state. Let $\pi$ be a left eigenvector of $T$; then $\pi T = \lambda\pi$ for some $\lambda$. But $\pi$ and $\lambda\pi$ are both
probability vectors, so their entries must sum to 1. Thus the eigenvalue $\lambda = 1$ and
the eigenvector equation is simply $\pi T = \pi$. The vector $\pi$ solving this equation is the stationary
state distribution of the process, the asymptotic limit of the likelihood of finding the system in each
state.
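The eigenvector condition $\pi T = \pi$ can be solved numerically. The following is a minimal sketch, assuming NumPy and a unique stationary distribution. The function name `stationary_distribution` and the two-state example (the Golden Mean process, in which the symbol 0 never repeats) are illustrative choices of ours and are not taken from the text.

```python
import numpy as np

def stationary_distribution(labeled):
    """Solve pi T = pi with sum(pi) = 1, where T is the sum of the
    symbol-labeled matrices T^(x). Assumes the solution is unique."""
    T = sum(labeled.values())
    n = T.shape[0]
    # Stack the eigenvector condition (T^T - I) pi = 0 on top of the
    # normalization constraint sum(pi) = 1 and solve by least squares.
    A = np.vstack([T.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical example (Golden Mean process), keyed by output symbol.
labeled = {
    0: np.array([[0.0, 0.5], [0.0, 0.0]]),
    1: np.array([[0.5, 0.0], [1.0, 0.0]]),
}
pi = stationary_distribution(labeled)
print(pi)
```

For this example the stationary distribution is approximately (2/3, 1/3). Solving the stacked linear system avoids having to pick out the eigenvalue-1 eigenvector and renormalize it by hand.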

Finally, we note that the class of ε-machines is a subset of that of HMMs, as an ε-machine
must possess two additional properties [3]. The first is unifilarity, the property that each symbol
uniquely determines the next causal state, given the current state. Graphically, this corresponds to no
more than one arrow per output symbol originating from each state. The second property is that
an ε-machine is minimal; there is no simpler HMM that gives identical predictions. To recapitulate,
the ε-machine $M(\mathcal{P})$ is the unique (proof given in [6]) minimal unifilar HMM of a process $\mathcal{P}$, which
greatly simplifies the calculation of invariants of the process and provides optimal
predictive capacity.
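Unifilarity is easy to test on a set of symbol-labeled matrices: in each matrix, every row may contain at most one nonzero entry. The sketch below assumes NumPy. The function name `is_unifilar` and both two-state examples are hypothetical and serve only to illustrate the property.

```python
import numpy as np

def is_unifilar(labeled, tol=1e-12):
    """Return True if, for every state and symbol, at most one successor
    state has nonzero transition probability."""
    for T in labeled.values():
        successors_per_state = (np.asarray(T) > tol).sum(axis=1)
        if np.any(successors_per_state > 1):
            return False
    return True

# Hypothetical unifilar example: each (state, symbol) pair has one successor.
golden_mean = {
    0: np.array([[0.0, 0.5], [0.0, 0.0]]),
    1: np.array([[0.5, 0.0], [1.0, 0.0]]),
}

# Hypothetical non-unifilar example: from state 0, the symbol 0 can lead
# to either state, so the symbol does not determine the next state.
ambiguous = {
    0: np.array([[0.25, 0.25], [0.0, 0.0]]),
    1: np.array([[0.5, 0.0], [1.0, 0.0]]),
}

print(is_unifilar(golden_mean), is_unifilar(ambiguous))
```

This check covers only unifilarity. Minimality is a separate condition and is not tested here.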

1.3  Characterizing  information-theoretic  properties 

The ε-machine formalism makes explicit the derivation of several invariant properties we wish to
define for a given process. These rely on the concept of Shannon entropy. The Shannon entropy
is a property of a random variable $X$ that measures the expected value of the “surprise” of its
outcomes. The degree of surprise, or information, of some particular outcome $i$ with probability $p_i$
is defined as $I_i = -\log_2 p_i$ [1]. The expectation value, denoted $H$, is given by

$$H[X] = \sum_{x \in \mathcal{X}} f_X(x)\, I(x) = \sum_{x} p_x \left(-\log_2 p_x\right) \tag{1}$$

We wish to extend this notion to the more general case of a process, rather than a single random
variable, via its ε-machine. Henceforth we will use $\lg$ to refer to the base-2 logarithm.

The first quantity obtained is the statistical complexity $C_\mu$, which we interpret as a measure
of the degree of structure of a process, or the amount of historical information communicated by
the causal states. It is accordingly measured in bits and is a straightforward generalization of the
Shannon entropy. Instead of the entropy of the observed symbols, it is the entropy of the causal
states of a process:

$$C_\mu = H[\mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \lg \Pr(S) \tag{2}$$

The probability distribution over the causal states is taken from the stationary state vector $\pi$, as
described in section 1.2.

How do we define the entropy contained in the symbols themselves? This is accomplished
through the entropy rate $h_\mu$, which is given in units of bits per symbol and is likewise an invariant
of a process. It measures the rate at which new information is generated by the symbols, or the process’s
degree of stochasticity. This is somewhat less straightforward to derive, as it requires an asymptotic
limit to eliminate the vagaries of any finite sequence of symbols. The ε-machine formalism simplifies
this process. Because infinite-length histories are reduced to a set of causal states $\mathcal{S}$, we calculate
the entropy of the symbols conditioned on the set of states; the unifilarity property discussed in
section 1.2 ensures that this definition coincides with $h_\mu$:

$$\begin{aligned}
h_\mu = H[X \mid \mathcal{S}] &= -\sum_{S \in \mathcal{S}} \Pr(S) \sum_{x \in \mathcal{X}} \Pr(x \mid S) \lg \Pr(x \mid S) \\
&= -\sum_{S \in \mathcal{S}} \Pr(S) \sum_{x \in \mathcal{X},\, S' \in \mathcal{S}} T^{(x)}_{SS'} \lg T^{(x)}_{SS'}
\end{aligned} \tag{3}$$

Examples of these properties, as well as introductions to some of the canonical ε-machines, are
given in the highly-recommended Appendix A. Also note that, while it is not discussed here, the
concept of excess entropy $\mathbf{E}$ is a similar quantity (the mutual information between the past and
future) that is foundational to this formalism [2].
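Equations (2) and (3) translate directly into a few lines of code. The sketch below assumes NumPy and a unifilar presentation with a unique stationary distribution. The helper names and the two-state example (the Golden Mean process) are our own illustrative choices, not taken from the text.

```python
import numpy as np

def stationary_distribution(labeled):
    """Solve pi T = pi with sum(pi) = 1 for T = sum of the T^(x)."""
    T = sum(labeled.values())
    n = T.shape[0]
    A = np.vstack([T.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def entropy_bits(probs):
    """Shannon entropy in bits, using the convention 0 lg 0 = 0."""
    p = np.asarray(probs, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def statistical_complexity(labeled):
    """C_mu: entropy of the stationary distribution over states, eq. (2)."""
    return entropy_bits(stationary_distribution(labeled))

def entropy_rate(labeled):
    """h_mu: state-averaged entropy of the outgoing transitions, eq. (3)."""
    pi = stationary_distribution(labeled)
    return sum(
        pi[s] * entropy_bits([T[s] for T in labeled.values()])
        for s in range(len(pi))
    )

# Hypothetical example (Golden Mean process), keyed by output symbol.
golden_mean = {
    0: np.array([[0.0, 0.5], [0.0, 0.0]]),
    1: np.array([[0.5, 0.0], [1.0, 0.0]]),
}
c_mu = statistical_complexity(golden_mean)
h_mu = entropy_rate(golden_mean)
print(c_mu, h_mu)
```

For this example $C_\mu \approx 0.918$ bits and $h_\mu = 2/3$ bits per symbol. The entropy rate formula is only valid here because the presentation is unifilar, as noted in the text.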
2 Quantifying irreversible behavior

We mentioned earlier that a stationary process allows one freedom with regard to shifting the
origin of time-indexing. We now investigate taking another liberty by scanning the time series
in the time-reversed direction. We may rederive the fundamentals of our analysis, instead using
the future to retrodict the past. This gives a new set of retrodictive causal states $\mathcal{S}^-$, along with
transition matrices $\mathbf{T}^-$ comprising the retrodictive ε-machine $M^-$. Henceforth we will also refer
to the forward-time versions of these with a superscript: $\mathcal{S}^+$, $\mathbf{T}^+$, and $M^+$.

It is a result of [3] that the entropy rate of a time-reversed process is the same as that of
the forward process, that is, $H[X_0 \mid \mathcal{S}_{-1}] = H[X_{-1} \mid \mathcal{S}_0]$. This corresponds to the intuitive notion that
the amount of information transmitted by a process is independent of the direction of time.
The same is not true of the statistical complexity: many simple counterexamples exist showing
that $C_\mu^+ \neq C_\mu^-$. Here we present the Random Noisy Copy (RnC) Process as an example in order
to demonstrate the process of time reversal of an ε-machine.

2.1  Time  reversal 


Figure 2: The random noisy copy (RnC) ε-machine


Reversal of an ε-machine $M^+$ is central to the topic of this report, and it can be
accomplished algorithmically. We will perform the operation once here on the RnC process in order
to convey the intuition. It is performed in two steps, denoted by two maps: $\mathcal{T}$, the time-reversal
operator, and $\mathcal{U}$, which computes the mixed-state presentation in order to restore unifilarity. The
RnC process, in brief, generates symbols in couplets with an alphabet $\mathcal{X} = \mathbb{Z}_2$. The first symbol in
each couplet is a 0 with probability $p$. If this is the case, the next symbol is also a 0 with certainty.
If the first symbol is a 1, the next has a probability $q$ of switching to a 0; otherwise it is also a
1. Thus, there are three causal states: the “reset” position, after a couplet is completed, and one
state each for the initial symbol being a 0 or 1. These states are $\mathcal{S} = \{\alpha, \beta, \gamma\}$, respectively. Figure
2 gives the visual ε-machine. From this we see that, with rows and columns ordered $(\alpha, \beta, \gamma)$,

$$T^{(0)} = \begin{pmatrix} 0 & p & 0 \\ 1 & 0 & 0 \\ q & 0 & 0 \end{pmatrix}, \qquad
T^{(1)} = \begin{pmatrix} 0 & 0 & 1-p \\ 0 & 0 & 0 \\ 1-q & 0 & 0 \end{pmatrix}$$

Time-reversing the ε-machine is a straightforward operation on the transition matrices but is not
guaranteed to generate another ε-machine. We denote the states of the time-reversed HMM as $\mathcal{R}$
rather than $\mathcal{S}$:


$$\tilde{T}^{(x)}_{R'R} = \Pr(X = x, R \mid R') = T^{(x)}_{RR'} \frac{\Pr(R)}{\Pr(R')} \tag{4}$$


That is, the time-reversed transition matrices are simply the transposes of those of the forward-time
presentation, normalized by ratios of the stationary state probabilities, which are the same as
those of the stationary causal state distribution $\pi$. This specifies $\mathcal{T}(M^+) = \tilde{M}^+$,
which has the same number of states as $M^+$. Applying $\mathcal{T}$ to the RnC gives


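$$\tilde{T}^{(0)} = \begin{pmatrix} 0 & p & q(1-p) \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\tilde{T}^{(1)} = \begin{pmatrix} 0 & 0 & (1-q)(1-p) \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$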
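The time-reversal step of equation (4) can be checked numerically for the RnC process. The sketch below assumes NumPy. The forward matrices are our reading of the verbal description of the RnC process, with states ordered $(\alpha, \beta, \gamma)$ and stationary distribution $(1/2,\ p/2,\ (1-p)/2)$, and the function names are hypothetical.

```python
import numpy as np

def rnc_forward(p, q):
    """Forward symbol-labeled matrices of the RnC process, states ordered
    (alpha, beta, gamma). Read off from the verbal description."""
    T0 = np.array([[0, p, 0], [1, 0, 0], [q, 0, 0]], dtype=float)
    T1 = np.array([[0, 0, 1 - p], [0, 0, 0], [1 - q, 0, 0]], dtype=float)
    return {0: T0, 1: T1}

def time_reverse(labeled, pi):
    """Equation (4): reversed[x][r', r] = T[x][r, r'] * pi[r] / pi[r'].
    Requires every entry of pi to be strictly positive."""
    pi = np.asarray(pi, dtype=float)
    return {x: T.T * pi[None, :] / pi[:, None] for x, T in labeled.items()}

p, q = 0.3, 0.6
pi = np.array([0.5, p / 2, (1 - p) / 2])
reversed_hmm = time_reverse(rnc_forward(p, q), pi)
print(reversed_hmm[0])
print(reversed_hmm[1])
```

The output reproduces the two reversed matrices above for the chosen values of $p$ and $q$. The reversed symbol 0 matrix has two nonzero entries in its first row, which is exactly the loss of unifilarity that the mixed-state presentation is meant to repair.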

We are now ready to apply the mixed-state presentation $\mathcal{U}(\tilde{M}^+)$ to obtain an ε-machine giving
the time-reversed model $M^-$. The first step is to begin with the stationary state $\pi$, a possibly
transient causal state that can be viewed as representing a lack of knowledge about the state. This
is the state of the system before any symbols are observed. In general, we will denote the system
state after observing word $w$ as $\nu(w)$. After making an observation, we are now in one of two
states: $\nu(0)$ or $\nu(1)$. These are simply created by evolving the stationary state by the appropriate
transition matrix and normalizing:


$$\nu(0) = \frac{\pi \tilde{T}^{(0)}}{|\pi \tilde{T}^{(0)}|}, \qquad
\nu(1) = \frac{\pi \tilde{T}^{(1)}}{|\pi \tilde{T}^{(1)}|} \tag{5}$$


where $|\cdot|$ denotes the 1-norm of the stochastic vector. Each of these represents a causal state of
the ε-machine, which may or may not be transient. The quantities $|\pi \tilde{T}^{(0)}|$ and $|\pi \tilde{T}^{(1)}|$ represent the
probability of transition to each state from the initial stationary state. The next step in the process
is to prepend the next symbols: $\nu(0)$ leads to $\nu(10)$ and $\nu(00)$, and $\nu(1)$ to $\nu(01)$ and $\nu(11)$. To
calculate the resultant vectors for each of these, the formula given in equation 5 can be iterated:


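$$\nu(00) = \frac{\nu(0) \tilde{T}^{(0)}}{|\nu(0) \tilde{T}^{(0)}|}, \qquad
\nu(10) = \frac{\nu(0) \tilde{T}^{(1)}}{|\nu(0) \tilde{T}^{(1)}|}, \qquad
\nu(01) = \frac{\nu(1) \tilde{T}^{(0)}}{|\nu(1) \tilde{T}^{(0)}|}, \qquad
\nu(11) = \frac{\nu(1) \tilde{T}^{(1)}}{|\nu(1) \tilde{T}^{(1)}|}$$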


Again, the denominators represent the transition probabilities and must be recorded in order to
construct the time-reversed model. It may be that some of the $\nu$ states given by these calculations
are identical to those already found. In this case, the sequence including them need not be
continued. This process of prepending symbols and calculating transition probabilities is repeated
until no new states are found. Then the ε-machine is simply this collection of states $\mathcal{S}^-$, along
with the transition matrices that can be derived from the probabilities. It may be that transient
states separate from the stationary state exist; depending on the purpose, these may be retained or
discarded from the model. Performing the calculations to evaluate each individual state is tedious;
the result in the example case is that the recurrent causal states of the reverse ε-machine are given
by $\nu(10)$, $\nu(100)$, $\nu(1001)$.
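The iteration described above is mechanical enough to automate. The following is a minimal sketch, assuming NumPy, of a mixed-state construction applied to the reversed RnC matrices. The function names, the rounding used to recognize repeated states, and the least-squares solve for the recurrent distribution are our own choices and are not taken from the text.

```python
import numpy as np

def mixed_state_presentation(labeled, pi, decimals=9):
    """Generate mixed states nu(w) by repeatedly applying the labeled
    matrices to pi and normalizing, until no new states appear."""
    start = tuple(np.round(pi, decimals))
    states = {start: np.asarray(pi, dtype=float)}
    transitions = {}  # (state_key, symbol) -> (next_key, probability)
    frontier = [start]
    while frontier:
        key = frontier.pop()
        nu = states[key]
        for x, T in labeled.items():
            v = nu @ T
            prob = v.sum()  # 1-norm, since all entries are nonnegative
            if prob < 1e-12:
                continue  # symbol x cannot be observed from this state
            nxt = v / prob
            nkey = tuple(np.round(nxt, decimals))
            if nkey not in states:
                states[nkey] = nxt
                frontier.append(nkey)
            transitions[(key, x)] = (nkey, prob)
    return states, transitions

def recurrent_distribution(states, transitions):
    """Stationary distribution over mixed states; transient states get
    (numerically) zero weight and are dropped."""
    keys = list(states)
    index = {k: i for i, k in enumerate(keys)}
    W = np.zeros((len(keys), len(keys)))
    for (k, _), (nk, prob) in transitions.items():
        W[index[k], index[nk]] += prob
    A = np.vstack([W.T - np.eye(len(keys)), np.ones((1, len(keys)))])
    b = np.zeros(len(keys) + 1)
    b[-1] = 1.0
    dist, *_ = np.linalg.lstsq(A, b, rcond=None)
    return {k: d for k, d in zip(keys, dist) if d > 1e-9}

p, q = 0.3, 0.6
pi = np.array([0.5, p / 2, (1 - p) / 2])
reversed_hmm = {
    0: np.array([[0, p, q * (1 - p)], [1, 0, 0], [0, 0, 0]]),
    1: np.array([[0, 0, (1 - q) * (1 - p)], [0, 0, 0], [1, 0, 0]]),
}
states, transitions = mixed_state_presentation(reversed_hmm, pi)
recurrent = recurrent_distribution(states, transitions)
print(len(states), len(recurrent))
```

For these parameter values the construction terminates with a finite set of mixed states, three of which are recurrent. Matching states by rounded coordinates is a pragmatic shortcut. A careful implementation would compare states with an explicit tolerance.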
Figure 3: The time-reversed ε-machine of the RnC process


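The forward statistical complexity is

$$\begin{aligned}
C_\mu^+ &= -\sum_{S \in \mathcal{S}^+} \Pr(S) \lg \Pr(S) \\
&= -\left( \tfrac{1}{2} \lg \tfrac{1}{2} + \tfrac{p}{2} \lg \tfrac{p}{2} + \tfrac{1-p}{2} \lg \tfrac{1-p}{2} \right) \\
&= -\tfrac{1}{2} \left( -1 + p \lg p - p \lg 2 + (1-p) \lg (1-p) - (1-p) \lg 2 \right) \\
&= 1 - \tfrac{1}{2} \left( p \lg p + (1-p) \lg (1-p) \right)
\end{aligned} \tag{6}$$

Similarly, the reverse statistical complexity is calculated

$$\begin{aligned}
C_\mu^- &= -\sum_{S \in \mathcal{S}^-} \Pr(S) \lg \Pr(S) \\
&= -\left( \tfrac{1}{2} \lg \tfrac{1}{2} + \tfrac{(1-p)(1-q)}{2} \lg \tfrac{(1-p)(1-q)}{2} + \tfrac{p + q(1-p)}{2} \lg \tfrac{p + q(1-p)}{2} \right) \\
&= 1 - \tfrac{1}{2} \left\{ (1-p)(1-q) \lg [(1-p)(1-q)] + (p + q(1-p)) \lg (p + q(1-p)) \right\}
\end{aligned} \tag{7}$$

Their difference is

$$C_\mu^+ - C_\mu^- = -\tfrac{1}{2} \left[ \left( p \lg p + (1-p) \lg (1-p) \right) - \left( (1-p)(1-q) \lg [(1-p)(1-q)] + (p + q(1-p)) \lg (p + q(1-p)) \right) \right] \tag{8}$$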


We present plots of the invariants of the RnC process, including the complexity difference of equation 8, over the space of $p$, $q$ values in figure 4.
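The closed forms in equations (6) to (8) are straightforward to evaluate over the $(p, q)$ plane. The sketch below uses only the Python standard library. The function names are hypothetical, and the convention $0 \lg 0 = 0$ is an assumption made so that the boundary values of $p$ and $q$ can be evaluated.

```python
from math import log2

def xlgx(x):
    """x lg x, with the convention 0 lg 0 = 0."""
    return 0.0 if x == 0 else x * log2(x)

def rnc_forward_complexity(p):
    """Equation (6)."""
    return 1 - 0.5 * (xlgx(p) + xlgx(1 - p))

def rnc_reverse_complexity(p, q):
    """Equation (7)."""
    a = (1 - p) * (1 - q)
    b = p + q * (1 - p)
    return 1 - 0.5 * (xlgx(a) + xlgx(b))

def rnc_complexity_difference(p, q):
    """Equation (8): forward minus reverse statistical complexity."""
    return rnc_forward_complexity(p) - rnc_reverse_complexity(p, q)

print(rnc_forward_complexity(0.5))
print(rnc_complexity_difference(0.5, 0.5))
print(rnc_complexity_difference(0.3, 0.0))
```

At $q = 0$ the second symbol of each couplet always copies the first, and the difference vanishes for every $p$, which is a convenient sanity check on the formulas.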


2.2  Definitions  of  reversibility 

We may use the general discrepancy between the statistical complexities of $M^+$ and $M^-$ to provide
a coarse measure of the degree to which a process is irreversible, which motivates the definition of
the causal irreversibility $\Xi$:

$$\Xi = C_\mu^+ - C_\mu^- \tag{9}$$

This raises questions as to the intuition of reversibility. We describe a process $\mathcal{P}$ as microscopically
reversible if $M^+(\mathcal{P}) = M^-(\mathcal{P})$. It is clear that if this is the case, then $\Xi = 0$. However, the
converse is not true; it is possible that $\Xi = 0$ for a process that is not microscopically reversible. A
simple example presented in [3] is that the process generating $\ldots 123123123 \ldots$ is not microscopically
reversible, as the probability of the word 123 is not the same in the forward ε-machine as
in the reverse (which generates $\ldots 321321321 \ldots$), but $\Xi = 0$, as the entropy of each state in both
machines is identical. Furthermore, an automorphism on the alphabet of the reverse machine is
sufficient to generate identical processes. Clearly the definition of microscopically reversible is too
strict to capture all of the relevant behavior, but causal irreversibility is somewhat imprecise, and
obscures a process’s structure.

Figure 4: Invariants of the RnC process over the space of $p$, $q$ values

2.3 The {N, M}-Flower Process


Figure 5: Example forward (a) and reverse (b) ε-machines for a {4, 3}-Flower process. Symbols
{1, 2, 3} correspond to the three forward “petal” states, while symbols {A, B} correspond to the
two time-reversed petals.

We seek a class of processes that can more effectively probe these definitions. To that end,
we introduce the Flower processes, a family of ε-machines taking two parameters, which allows
a fully programmable number of causal states in both the forward and reverse ε-machines. The
Flower process is structured such that the forward-time process contains a central causal state
connected to $N - 1$ “petals.” There is an outgoing path from the node to each petal, using unique
symbols, and the number of return paths to the central state from each petal is equal to the
number of time-reversed causal state petals, $M - 1$. Thus, an alphabet of size $|\mathcal{X}| = N + M - 2$ is
necessary for this implementation. If there is degeneracy in the return transition probabilities to
the central state in the forward model, these will collapse in the time-reversed model; thus these
distributions must be unique to each petal, as can be observed in figure 5. We will use the Flower
with uniform “outgoing” distribution (leaving the central state) in the forward-time direction. In these
circumstances, the number of petals in each direction is directly tied to the causal irreversibility.
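A forward Flower machine can be assembled directly from this description. The sketch below assumes NumPy, places the central state at index 0, and uses hypothetical symbol names (`out1`, `ret1`, and so on). The irreversibility formula in `flower_irreversibility` is our own derivation and is not taken from the text. It assumes that each return symbol leads to a distinct reverse petal state, so that the reverse state distribution is 1/2 on the central state and half the petal-averaged return distribution on the reverse petals.

```python
import numpy as np

def flower_forward(N, returns):
    """Symbol-labeled matrices of a forward {N, M}-Flower.
    State 0 is the central state; states 1..N-1 are petals.
    returns[i] is the return-symbol distribution (length M-1) of petal i+1."""
    returns = np.asarray(returns, dtype=float)
    petals, n_return = returns.shape
    assert petals == N - 1
    labeled = {}
    for i in range(petals):  # one unique outgoing symbol per petal
        T = np.zeros((N, N))
        T[0, i + 1] = 1.0 / petals  # uniform outgoing distribution
        labeled[f"out{i + 1}"] = T
    for y in range(n_return):  # M-1 return symbols shared by all petals
        T = np.zeros((N, N))
        T[1:, 0] = returns[:, y]
        labeled[f"ret{y + 1}"] = T
    return labeled

def entropy_bits(probs):
    p = np.asarray(probs, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def flower_irreversibility(N, returns):
    """Forward minus reverse statistical complexity under the stated
    assumptions (our derivation, uniform outgoing distribution)."""
    returns = np.asarray(returns, dtype=float)
    forward = 1 + 0.5 * np.log2(N - 1)
    reverse = 1 + 0.5 * entropy_bits(returns.mean(axis=0))
    return forward - reverse

# Hypothetical {4, 3}-Flower: three petals, two return symbols.
returns = [[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]]
flower = flower_forward(4, returns)
xi = flower_irreversibility(4, returns)
print(len(flower), xi)
```

The example uses three distinct return distributions whose average is uniform. The alphabet has $N + M - 2 = 5$ symbols, as stated in the text.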


Figure 6: Mean irreversibility $\Xi$ of the Flower process against the number of forward causal states $N$. The correlation between irreversibility and $N$, $M$ is seen to be $\Xi \propto N - M$, or more
suggestively, $C_\mu^+ - C_\mu^- \propto |\mathcal{S}^+| - |\mathcal{S}^-|$

Figure 6 shows the mean $\Xi$ of 750 Flowers for every ε-machine with $N, M \in \{3, \ldots, 16\}$, with
uniform outgoing forward-time distributions and randomly-generated, but unique, incoming probabilities.
The results show that for the Flower process, the difference in the number of forward and reverse
states in large part determines the irreversibility. These results are easy to interpret. Adding states
to one direction of the machine boosts the complexity in that direction while having little effect
on the complexity in the other. Thus, when the number of states is the same, the complexities are
the same. The cost of increasing the number of states in use in the Flower process is the increasing
alphabet.

The effect of the “incoming” probability distributions on the irreversibility of the Flower process
is the subject of a proof presented in Appendix B. While the process is tedious, the results are
illustrated by figure 7. If the transition probabilities are made too extreme (that is, one set at 1
and the others at 0), the machine’s topology is altered and it no longer lies within the family of
Flower processes. However, as can be seen from the figure and in the proof, the second derivative
of $\Xi$ is positive along the space of transition probabilities. Thus, the extremum within this simplex
is a minimum, with what we shall refer to as phase transitions (that is, alterations to the graph
structure of the ε-machine) occurring at the boundaries and the minimum.
Figure 7: Irreversibility is evidently minimized for a uniform probability distribution, in this case
for  =  1.5,  as  the  number  of  forward  petals  is  3  and  the  number  of  reverse  petals  is  2. 

3  Topological  irreversibility 

The existence of phase transitions in ε-machine topology at the extrema suggests that the graph
structure of a process is more fundamental in determining the irreversibility than the transition
probability distributions. We thus turn our attention to structures that can more clearly demonstrate
the relevant aspects of a process.

3.1  Finite  automata 

To do so, it is helpful to venture outside the formalism of the ε-machine. We reproduce the
definition from automaton theory given in [5]: a deterministic finite automaton (DFA) is a 5-tuple
$(Q, \Sigma, \delta, q_0, F)$, where $Q$ is a finite set of states, $\Sigma$ is a discrete alphabet, $\delta : Q \times \Sigma \to Q$ is the
transition function, $q_0$ is the start state, and $F \subseteq Q$ is the set of accepting states. To match the
notation used in the definition of the DFA with our previous notation: $Q = \mathcal{S}$, the causal states, and
$\Sigma = \mathcal{X}$, the process alphabet. The transition function $\delta$ is analogous to the set of transition matrices
$\mathbf{T}$, and is interpretable as the act of making an observation; that $\delta$ is well-defined corresponds to
the property of unifilarity and is the source of the deterministic qualifier of the DFA.

We use the concept of a DFA in a restricted sense, discarding some of its utility as a theoretical
concept in order to provide a closer abstraction of the ε-machine formalism. To that end we set
the start state $q_0 = \pi$, representing an initial lack of knowledge of the state of the system. We also
set $F = Q = \mathcal{S}$; that is, any causal state is allowed to be the “final” state of the system. These
stipulations discard the notion of an initial and final point in time for the system, allowing us to
continue treating it as a bi-infinite series of random variables, a process evolving in time by continued
application of the $\delta$ function on the present state. The advantage over the ε-machine formalism is
that transition probabilities are discarded; either a particular transition in states is allowed by $\delta$ or
it is not.
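A transition function of this kind is conveniently stored as a lookup table. The sketch below is a minimal illustration in plain Python. The function name `run_dfa`, the state labels, and the two-state example are hypothetical, and a missing table entry is taken to mean that the transition is not allowed.

```python
def run_dfa(delta, state, word):
    """Apply the transition function delta along a word of symbols.
    Returns the final state, or None if some transition is not allowed."""
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:
            return None
    return state

# Hypothetical two-state machine over {"0", "1"} in which the
# symbol "0" may never be observed twice in a row.
delta = {
    ("A", "1"): "A",
    ("A", "0"): "B",
    ("B", "1"): "A",
}

print(run_dfa(delta, "A", "1101"))
print(run_dfa(delta, "A", "100"))
```

Since every state is accepting in the restricted sense used here, a word is allowed exactly when `run_dfa` does not return `None`.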

Examples of DFAs are given in figure 8. It should be noted that we now restrict the alphabet to
$\mathbb{Z}_2 = \{0, 1\}$. We claim that this induces no loss of generality. Given a process with some alphabet
$\mathcal{X}$, a binary encoding can be constructed of the symbols in $\mathcal{X}$, essentially mapping the process to
another with an alphabet of $\mathbb{Z}_2$ at the cost of increased complexity in the causal states.

Figure 8: Examples of deterministic finite automata with binary alphabets, showing similarities to graph
presentations of ε-machines without transition probabilities

Our definition of causal irreversibility $\Xi$ is not calculable for a DFA, except by assuming some
probability distribution not provided by the formalism. We therefore require a new test of
irreversibility. We refer to this as topological reversibility and define it as the existence of a graph
isomorphism between the forward and reverse processes, matching all states and transitions. A lack
of such an isomorphism between the forward and time-reversed DFAs is topological irreversibility.
There is a caveat to this definition, as it is possible for a DFA to have a finite number of states
in the forward direction but to have the determinization algorithm of the time-reversed machine
not converge; that is, there is an infinite number of states in the reversed DFA. We refer to a DFA
with this property as infinitely irreversible; its reverse cannot be determinized. Each DFA thus fits into
one of three categories: topologically reversible, topologically irreversible, or infinitely irreversible. As an auxiliary component of the research performed for this project
we surveyed all topological ε-machines (DFAs, as used here) below a certain size in an attempt
to gauge the relative incidence of each of these characterizations. The results of that study are
presented in Appendix C.

3.2  Semiperiodic  DFAs 

We restrict our analysis to a particular model class of DFAs. A periodic DFA is one which repeats
a periodic sequence of some finite length, and has a cyclic graph representation like that shown in
figure 8(c). This model class is closed under time reversal, as the automata produced by application
of the $\mathcal{T}$ operator are deterministic and the action of $\mathcal{U}$ is trivial. This property is referred to as
counifilarity. We introduce one feature to this family to produce the class of semiperiodic DFAs,
a phase slip transition or defect that skips a certain number of states. Because this definition
is not found in the literature, we state it precisely: A semi-periodic DFA (SPDFA) is one built
on a periodic DFA, with the addition of $n$ nonoverlapping phase slip, or defect, transitions in
the forward direction, of lengths $\{l_1, \ldots, l_n\}$ and spacings $\{r_1, \ldots, r_n\}$, which bypass one or more
consecutive states. Note that our intuition of a directionality (‘forward’, ‘bypass’) comes from
the periodic backbone word of the DFA. An example of a SPDFA is given in figure 9, and it is
reproduced in figure 10, showing the crucial property: though the model is simple, by adding the
defect transitions we can induce topological irreversibility. The specification of the $r$ and $l$ values
characterizing a SPDFA assists in identifying topological irreversibility. If the time-reversed $r^-$ and
$l^-$ sequences are not cyclic permutations (permuted by the same number of indices) of $r^+$ and $l^+$,
then the SPDFA is topologically irreversible.




Figure 9: Example SPDFA, showing the specifications of the $l = \{1, 2\}$ and $r = \{4, 3\}$ parameters.
Note  that  |<S|  =  h- 


Under what circumstances is a SPDFA reversible, irreversible, or infinitely irreversible? As
defined here, a SPDFA with $n = 1$ is categorically reversible, for reasons that will become clear.
The simplest system that shows irreversible behavior is the $n = 2$ case. Therefore, we focus on it.
An example of an irreversible $n = 2$ system is shown in figure 10, with the chosen initial phase $\phi_0$
indicated. A general property can be seen in this example, namely that the $l$ values, the lengths
of the defects, stay constant under time-reversal. This is true as long as the time-reversed DFA
remains within the model class. It also shows why an $n = 1$ system must be reversible, as the $l$ value
being constant implies that the $r$ value is constant (by the relationship given in figure 9). In the
$n > 1$ case, we may attempt to quantify the amount of irreversibility by calculating

$$\Delta r_{\min} = \min r^+ - \min r^- \tag{10}$$

This is a particularly explicit measure in the $n = 2$ case.
Figure 10: Demonstration of irreversibility of the SPDFA shown in figure 9, with initial phase
indicated

3.3  Ensemble  descriptions 

In order to determine the effect on irreversibility of the $r$-values and the symbol patterns in the
neighborhood of the defects, we focus on a single, long periodic word backbone (60 states) and fixed
slip lengths $l = \{1, 1\}$. We then advance the two defects around the word, at each site measuring
$\Delta r_{\min}$. For cases where the defects overlap (which by definition breaks the model class) and those
where the defects in the time-reversed DFA overlap (again breaking the model class), no value
is computed. In this way, by using a single SPDFA we can acquire an ensemble description of
irreversible behavior as a function of defect parameters and the neighboring symbols. The results
for two words are given in figures 11 and 12, with each axis representing the state of origination of
one of the defect transitions, each skipping a single state.

The first graph shows a randomly-generated word which has a plaid structure indicating bands
of high and low irreversibility based on the position in the word of each defect. The plot is symmetric
about the $x = y$ line, as the defects are identical. If the graph is matched to the word, it can be
seen that the strong stripes observed at sites 24 and 29 correspond to the ends of the two longest blocks
of repeated symbols, ‘00000’ (sites 20-24) and ‘11111’ (sites 25-29). However, within the blocks
irreversibility remains low. Similarly, sections of very low irreversibility correspond to sites 8-13
and 40-44, subwords consisting of alternating symbols, but no large feature is seen at the end of
either section.
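The block positions quoted above can be checked directly against the word of figure 11. The following is a small sketch, assuming sites are numbered from 0 and treating the word as a linear string (the cyclic wrap-around of the backbone is ignored here); the helper names are illustrative only.

```python
from itertools import groupby

WORD_FIG_11 = "110010001010101100010000011111000100010010101001110110001100"


def symbol_runs(word):
    """Return (start, end, symbol) for each maximal block of repeated symbols.

    Indices are 0-based and inclusive.
    """
    runs = []
    position = 0
    for symbol, group in groupby(word):
        length = len(list(group))
        runs.append((position, position + length - 1, symbol))
        position += length
    return runs


def longest_runs(word, count=2):
    """Return the `count` longest blocks, longest first (ties keep word order)."""
    return sorted(symbol_runs(word), key=lambda run: run[0] - run[1])[:count]


print(longest_runs(WORD_FIG_11))  # [(20, 24, '0'), (25, 29, '1')]
```

With this numbering the two longest blocks end at sites 24 and 29, which are the sites of the strong stripes described in the text.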

The  word  used  for  the  second  graph,  figure  12,  was  created  in  response  to  these  observations  of 
the  previous  one,  and  displays  a  far  higher  degree  of  structure.  In  particular,  the  very  long  section 
of  repeated  symbols  in  the  middle  of  the  word  results  in  a  similarly  very  strong  site  of  irreversibility 
at  position  37.  The  section  of  alternating  symbols  that  follows  shows  low  irreversibility  throughout. 
Change in $r_{\min}$ under time reversal

Figure 11: Word 110010001010101100010000011111000100010010101001110110001100

Figure 12: Word 010010101110101001010010100000000000010101010101010101001010
4 Sources of irreversibility

We propose a local, algorithmic rule that can calculate all of the irreversibility observed in figures
11 and 12. It is as follows: for all defects in the SPDFA which are not locally counifilar,

1. Identify the subword that would be encapsulated by the time-reversed defect.

2. The total shift distance is equal to the number of times this subword is repeated in the periodic
word prior to the defect.
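The counting step of the rule can be sketched as follows. This is not the code used to produce figure 13; it is a minimal illustration under several assumptions that the text does not fix: sites are numbered from 0, the encapsulated subword is taken to be the `length` symbols starting at site `start`, "prior to the defect" is read as "at smaller site indices, cyclically", and the test for local counifilarity is left out entirely. The function names are hypothetical.

```python
WORD_FIG_11 = "110010001010101100010000011111000100010010101001110110001100"


def encapsulated_subword(word, start, length):
    """Cyclic slice of the periodic backbone word: `length` symbols from site `start`."""
    n = len(word)
    return "".join(word[(start + k) % n] for k in range(length))


def preceding_repeats(word, start, length):
    """Count back-to-back copies of the subword at `start` that immediately precede it."""
    target = encapsulated_subword(word, start, length)
    # Stop before wrapping all the way around onto the subword itself.
    max_copies = len(word) // length - 1
    count = 0
    while count < max_copies:
        block_start = start - (count + 1) * length
        if encapsulated_subword(word, block_start, length) != target:
            break
        count += 1
    return count


print(preceding_repeats(WORD_FIG_11, 24, 1))  # 4: four more '0' symbols precede site 24
print(preceding_repeats(WORD_FIG_11, 20, 1))  # 0: site 19 holds a '1'
```

Under these assumptions a single-state defect at the end of the ‘00000’ block sees four earlier copies of its subword, while one at the start of the block sees none, which is at least consistent with the stripe at site 24 in figure 11.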

Why  does  this  rule  work?  Figure  13  shows  the  prediction  generated  by  the  rule  applied  to  the 
randomly-generated  SPDFA  word,  and  the  differential  between  the  actual  value  and  the  prediction, 
showing  that  it  is  accurate  for  all  of  the  SPDFAs  which  remain  in  the  model  class.  It  can  be  grasped 
intuitively  by  understanding  the  graphs  of  semiperiodic  structure  as  a  system  of  pathways.  One 
can  traverse  the  DFA  by  following  any  allowed  pathway  in  the  forward  direction  and  observing  the 
symbol.  Because  the  process  is  deterministic,  one  always  knows  the  current  state  after  observing 
the  symbol;  thus,  the  state  of  the  system  and  one’s  knowledge  of  the  state  are  identical. 

Consider  now  applying  the  time-reversal  operator.  The  directionality  of  all  pathways  flips,  and 
one  suddenly  cannot  make  a  statement  about  what  state  the  system  is  in.  One  can  now  traverse 
the  reversed  system,  but  upon  encountering  a  site  of  nonunifilarity  the  same  symbol  is  observed  for 
multiple  choices  of  pathway.  Thus,  it  cannot  be  known  which  path  has  been  taken,  and  the  state 
of  mind  of  the  observer  is  a  combination  of  the  possible  states — in  this  case,  the  next  state  in  the 
periodic  word  and  the  one  arrived  at  by  skipping  over  the  encapsulated  subword.  One’s  knowledge 
of  the  state  deviates  from  the  state  itself.  This  lack  of  knowledge  (state-mixing)  persists  until 
reaching  a  synchronizing  symbol.  As  long  as  the  encapsulated  subword  and  the  word  observed 
by  taking  the  time-reversed  defect  transition  are  identical,  the  state  is  mixed  and  there  is  no 
distinction  made  between  having  taken  the  defect  transition  or  having  advanced  in  the  (reversed) 
periodic  word.  Once  the  first  symbol  is  observed  that  differentiates  these  two  options,  then  the 
path  taken  becomes  known,  and  we  observe  the  transition  to  have  taken  place  at  this  location. 
This  provides  the  mechanism  by  which  defect  transitions  are  moved  through  a  combination  of  local 
counifilarity  and  repetition  of  the  encapsulated  subword,  providing  the  entirety  of  the  irreversibility 
of  the  SPDFA  class. 
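The state-mixing argument can be illustrated by tracking the set of states an observer still considers possible. The sketch below uses a hypothetical six-state backbone word and a single defect; it assumes a binary alphabet and that the defect edge carries the symbol not used by the backbone edge leaving the same state, which is what keeps the forward machine deterministic. None of these specific values come from the figures.

```python
def reverse_edges(edges):
    """Time-reverse a labelled graph: (src, symbol, dst) becomes (dst, symbol, src)."""
    return [(dst, symbol, src) for (src, symbol, dst) in edges]


def observe(possible_states, symbol, edges):
    """States still possible after one symbol is observed."""
    return {dst for (src, sym, dst) in edges
            if src in possible_states and sym == symbol}


def track(start_states, symbols, edges):
    """Follow the observer's knowledge through a sequence of observed symbols."""
    possible = set(start_states)
    for symbol in symbols:
        possible = observe(possible, symbol, edges)
    return possible


# Hypothetical backbone: state i emits BACKBONE[i] on its way to state i + 1.
BACKBONE = "010110"
forward = [(i, BACKBONE[i], (i + 1) % 6) for i in range(6)]
forward.append((1, "0", 3))  # defect: state 1 skips state 2 (assumed symbol choice)

reverse = reverse_edges(forward)
print(sorted(track({4}, "1", reverse)))    # [3]
print(sorted(track({4}, "10", reverse)))   # [1, 2]  mixed: defect or backbone?
print(sorted(track({4}, "101", reverse)))  # [1]     resolved by the next symbol
```

In the forward direction every observed symbol leaves exactly one possible state. In the reversed graph the symbol seen on leaving state 3 is shared by the reversed defect and the reversed backbone edge, so the observer holds a two-state mixture until a distinguishing symbol arrives.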


Figure  13:  Output  and  comparison  of  the  SPDFA  time-reversal  prediction  rule  given  in  section  4 
5 Conclusion & acknowledgments

We have found using the formalism of the ε-machine that irreversibility in a process stems from
two distinct sources: the probability distributions and the topology. The system topology is the
more fundamental, in that topological irreversibility is a prerequisite for causal irreversibility, an
invariant of a process that can be tuned using transition probabilities. The “defect” motif presented
for topological ε-machines is a pattern of irreversibility driven only by the local process structure.

As has been stated, the definition of the semi-periodic DFA provides a very narrow set of
circumstances with which to work; the value of understanding irreversibility in this system lies not
in the system itself, but in providing a starting point for understanding other processes. The systems
presented here are very close to trivial, but demonstrate the beginnings of behavior that we observe
much more generally in larger systems, as shown in Appendix C. This makes it a good beginning,
possibly a “zero-order” term, for characterizing time irreversibility as a whole. For example, if a
complicated ε-machine is structured locally like the semiperiodic model, one could use the rule given
here to make a prediction of the locally irreversible behavior. It could even be possible to identify
other simple systems and, based on deconstruction into fundamental patterns, to characterize their
interaction in a larger ε-machine and predict the irreversibility of the entire system. This work may
provide the initial step to a project of this nature.

This project was advised by Dr. Jim Crutchfield, whose deep experience with this field was
invaluable. John Mahoney expended a great deal of his own time and thought during the summer
on helping us out with the specific tasks required as well as planning and managing the overall arc
and goals of the project. Chris Strelioff and Ryan James were likewise of great help with learning
Python in addition to the excellent CMPy package and the scheduler of the CSC Hive cluster.
A Sample processes and invariants

In order to demonstrate the intuition of the $C_\mu$ and $h_\mu$ invariants and the state vector $\pi$, they
are explicitly calculated here for some of the most basic canonical processes. These examples are
motivated by those given in reference [3].

Fair coin: In the fair-coin model, each random variable $X_i$ within the time series is IID with the
following distribution: $\{(H, \tfrac{1}{2}), (T, \tfrac{1}{2})\}$. Because of this property, there is no interaction between
any symbols. We therefore expect no complexity to be present because all histories are grouped
within a single causal state, $\alpha$.

The transition matrices $T$ and stationary causal state distribution $\pi$ are trivial in this instance,
as there is only a single causal state:

$$T = \begin{bmatrix} 1 \end{bmatrix}, \quad T^{(H)} = \begin{bmatrix} \tfrac{1}{2} \end{bmatrix}, \quad T^{(T)} = \begin{bmatrix} \tfrac{1}{2} \end{bmatrix}, \quad \pi = \begin{bmatrix} 1 \end{bmatrix}$$

Figure 14: Diagram of a fair coin toss

As we are now in possession of $\{\mathcal{S}, T\}$, we have specified the fair coin’s ε-machine. Given this
information, the predictive complexity of the fair coin is easily calculable:

$$C_\mu = H[\mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \lg \Pr(S) = -(1 \cdot 0) = 0 \text{ bits.}$$

It  is  no  surprise  that  this  is  the  case,  as  any  process  with  only  a  single  causal  state  cannot  carry 
any  information  about  its  own  history.  In  fact,  the  history  is  entirely  irrelevant  to  the  fair  coin  at 
any  time,  as  it  can  never  deviate  from  a  single  state.  The  entropy  rate  is 

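$$h_\mu = H[X \mid \mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \sum_{x, S'} T^{(x)}_{SS'} \lg T^{(x)}_{SS'} = -\left(1 \cdot (-1)\right) = 1 \text{ bit.}$$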

This  result,  too,  is  easily  intuitively  verified.  The  probability  distribution  is  mixed  maximally 
between  the  two  outcomes,  so  each  new  symbol  generates  a  new,  fully  random,  bit  of  information. 
Golden mean: The golden mean process is that which generates all binary strings except
those which contain 00 (in this case, BB). It has two states $\mathcal{S} = \{\alpha, \beta\}$, which can be thought of as
all histories ending in an A and all of those ending in a B. If the current state $S_0$ is $\alpha$, then there
is a 50% chance of the next character being an A and 50% chance of it being a B. However, if $S_0$
is $\beta$, the next character will be an A with certainty. This is represented in the transition matrices.
Solving for $\pi$ is straightforward, knowing the eigenvalue to be 1.


Figure  15:  Representation  of  golden  mean  process 
The complexity is calculated using equation 2:

$$C_\mu = H[\mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \lg \Pr(S) = -\left(\tfrac{2}{3} \lg \tfrac{2}{3} + \tfrac{1}{3} \lg \tfrac{1}{3}\right) = \lg 3 - \tfrac{2}{3} \approx 0.9183 \text{ bits}$$


The entropy rate is calculated using equation 3. Note that it is essentially the averaged entropy of
all of the causal states. $\alpha$ is maximally distributed between two states and so carries 1 bit of entropy, and
occurs $\tfrac{2}{3}$ of the time in the stationary distribution. $\beta$ carries no entropy, occurring $\tfrac{1}{3}$ of the time.
Thus, this result is easily predictable.

$$h_\mu = H[X \mid \mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \sum_{x, S'} T^{(x)}_{SS'} \lg T^{(x)}_{SS'} = \tfrac{2}{3} \text{ bits}$$
Even process: The even process looks much like the golden mean, but with different labeling.
It generates a sequence of B symbols, from which it is impossible to know the causal state, along
with some synchronization symbols A, from which the state can be determined. It is the first
example considered in which the state of the system is truly “hidden.” The decomposition of the
transition matrix is slightly different, but the result is identical to that of the golden mean.

Figure 16: Representation of even process

$$T = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ 1 & 0 \end{bmatrix}, \quad T^{(A)} = \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & 0 \end{bmatrix}, \quad T^{(B)} = \begin{bmatrix} 0 & \tfrac{1}{2} \\ 1 & 0 \end{bmatrix}, \quad \pi = \begin{bmatrix} \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix}$$

Because the state distribution is identical to the golden mean, the statistical complexity is also
the same.

$$C_\mu = H[\mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \lg \Pr(S) = -\left(\tfrac{2}{3} \lg \tfrac{2}{3} + \tfrac{1}{3} \lg \tfrac{1}{3}\right) = \lg 3 - \tfrac{2}{3} \text{ bits}$$

The entropy rate is not guaranteed to be the same as that of the golden mean, as the individual
symbol transition matrices are distinct:

$$h_\mu = H[X \mid \mathcal{S}] = -\sum_{S \in \mathcal{S}} \Pr(S) \sum_{x, S'} T^{(x)}_{SS'} \lg T^{(x)}_{SS'} = \tfrac{2}{3} \text{ bits}$$

However, the entropy rate does end up ultimately being the same.
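The three worked examples can be verified numerically. The sketch below is an illustration, not code from the CMPy package: the helper names are hypothetical, the stationary distribution is approximated by plain power iteration, and the golden mean matrices are written down from the verbal description above (state order $\alpha$, $\beta$).

```python
from math import log2


def stationary(T, iterations=200):
    """Approximate the stationary row vector pi = pi T by power iteration."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi


def invariants(labeled):
    """Return (C_mu, h_mu) in bits from a dict {symbol: matrix T^(x)}."""
    mats = list(labeled.values())
    n = len(mats[0])
    # The state-to-state matrix is the sum of the symbol-labeled matrices.
    T = [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]
    pi = stationary(T)
    c_mu = sum(-p * log2(p) for p in pi if p > 0)
    h_mu = sum(-pi[i] * m[i][j] * log2(m[i][j])
               for m in mats for i in range(n) for j in range(n) if m[i][j] > 0)
    return c_mu, h_mu


fair_coin = {"H": [[0.5]], "T": [[0.5]]}
golden_mean = {"A": [[0.5, 0.0], [1.0, 0.0]], "B": [[0.0, 0.5], [0.0, 0.0]]}
even = {"A": [[0.5, 0.0], [0.0, 0.0]], "B": [[0.0, 0.5], [1.0, 0.0]]}

for name, process in [("fair coin", fair_coin), ("golden mean", golden_mean), ("even", even)]:
    c_mu, h_mu = invariants(process)
    print(f"{name}: C_mu = {c_mu:.4f} bits, h_mu = {h_mu:.4f} bits")
```

The golden mean and even processes share the same state-to-state matrix and therefore the same $\pi$ and $C_\mu$; they differ only in how that matrix is split between the symbols A and B.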
B Derivation and extremization of $\Xi$ for Flower process

B.1 Uniform outgoing forward-time probability distribution

The outgoing forward-time probability distribution affects the irreversibility of the Flower process
in the following way. The forward complexity simply requires application of equation 2. Since all
petal states have certainty of transitioning to the central state, its probability in the stationary
distribution is $\tfrac{1}{2}$. Thus

$$C_\mu^+ = -\sum_{i=1}^{N} \Pr(S_i) \lg \Pr(S_i) = \frac{1}{2} - \sum_{i=2}^{N} \Pr(S_i) \lg \Pr(S_i)$$


We wish to maximize this function if we are to control the entropy. Intuitively this is given either
by creating one petal for which $\Pr(S) \approx 1$, or, by symmetry, by distributing the probability equally
amongst all petals. Because $\lim_{x \to 0} x \lg x = \lim_{x \to 1} x \lg x = 0$, if we were to group all of the
probability in a single element the Shannon entropy of the petals would vanish. We therefore distribute the
probability evenly among petals:

$$\begin{aligned}
C_\mu^+ &= \frac{1}{2} - \sum_{i=2}^{N} \frac{1}{2(N-1)} \lg \frac{1}{2(N-1)} \\
&= \frac{1}{2} + (N-1)\,\frac{1}{2(N-1)} \lg\bigl(2(N-1)\bigr) \\
&= \frac{1}{2} + \frac{1}{2}\bigl[\lg(N-1) + 1\bigr] \\
&= 1 + \frac{1}{2}\lg(N-1) = 1 + \frac{1}{2}\lg(N_{\text{petals}})
\end{aligned} \tag{11}$$

The forward-time complexity of the evenly distributed Flower process diverges as $O(\lg N)$, whereas
the alphabet size diverges as $O(N)$.
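Equation 11 is easy to check numerically. The sketch below assumes only what the derivation states: one central state with stationary probability 1/2 and the remaining probability split evenly over the $N - 1$ petals. The function names are illustrative.

```python
from math import log2


def shannon_entropy(probabilities):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return sum(-p * log2(p) for p in probabilities if p > 0)


def flower_forward_complexity(n_states):
    """C_mu^+ for one central state plus n_states - 1 equally likely petals."""
    petals = n_states - 1
    distribution = [0.5] + [0.5 / petals] * petals
    return shannon_entropy(distribution)


for n_states in (2, 5, 17):
    closed_form = 1 + 0.5 * log2(n_states - 1)  # right-hand side of equation 11
    print(n_states, flower_forward_complexity(n_states), closed_form)
```

For example, with $N = 5$ the four petals each carry probability 1/8, and both expressions give 2 bits.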

For ease of notation, we redefine $N$ and $M$ to be the number of forward and reverse petal states,
respectively, of a particular ε-machine; that is, $N = N_{\text{petals}} = N - 1$ and $M = M_{\text{petals}} = M - 1$ (the
$N$ and $M$ on the right-hand sides of these equations represent the Flower parameters and the
left-hand sides their usage here). Each of the forward-time petal states $S_i$ has an incoming transition
probability $p_i^{(x)}$ associated with each reverse symbol $x$. Time-reversing the system, after transposing
we have to normalize each of these probabilities by the corresponding outgoing probability, which
we know in this case to be $\tfrac{1}{N}$ since the “incoming” forward-time probability distribution is uniform.
Analysis of the mixed-state presentation is aided by the form of the state transition matrix for the
Flower process. It is, in the general case,

$$T = \begin{bmatrix} 0 & \frac{p^{(1)}}{N} & \cdots & \frac{p^{(M)}}{N} \\ \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \end{bmatrix}$$

where boldface numerals indicate column vectors. The decomposition is simple; the “outgoing”
state corresponds to a predominantly empty matrix holding the entry in the $\mathbf{1}$ column vector.
The “incoming” states are the same, but with each containing its total probability at its position
$j$ along row 0. Using these matrices to form the mixed state yields the following stationary
distribution without difficulty, since again we know that after transitioning to a “petal” state, a return
to the central (reverse) causal state is certain:

$$\pi = \begin{bmatrix} \frac{1}{2} & \frac{p^{(1)}}{2N} & \cdots & \frac{p^{(M)}}{2N} \end{bmatrix}$$

Therefore the reverse statistical complexity is

$$C_\mu^- = -\sum_{i} \Pr(S_i) \lg \Pr(S_i) = \frac{1}{2} - \sum_{j=1}^{M} \frac{p^{(j)}}{2N} \lg\left(\frac{p^{(j)}}{2N}\right) \tag{12}$$

And finally, combining equations 11 and 12, the causal irreversibility of a uniformly forward-distributed
$\{N, M\}$-Flower machine is

$$\Xi = C_\mu^+ - C_\mu^- = \frac{1}{2} + \frac{1}{2}\lg N + \sum_{j=1}^{M} \frac{p^{(j)}}{2N} \lg\left(\frac{p^{(j)}}{2N}\right) \tag{13}$$


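In order to maximize the irreversibility of the uniform-outgoing Flower process, we differentiate
equation 13 with respect to a vector $\mathbf{p}$ composed of the incoming transition probabilities,
$\mathbf{p} = \begin{bmatrix} p^{(1)} & p^{(2)} & \cdots & p^{(M)} \end{bmatrix}^{T}$,
where $p^{(j)} \in (0, 1)\ \forall\, j \in \{1, \dots, M\}$. This vector, being stochastic, must satisfy the law of total
probability, where the vector sums to $N$ instead of to 1 because the values given in $\mathbf{p}$ are distributed
over $N$ forward causal petals, each of which must sum to 1:

$$|\mathbf{p}| = \sum_{j=1}^{M} p^{(j)} = N \tag{14}$$

This gives the constraint on probabilities as a relationship between all variables $p^{(j)}$, namely that
they sum to $N$. From this we can deduce a rule for partial derivatives:

$$\frac{\partial p^{(i)}}{\partial p^{(j)}} = \begin{cases} 1, & i = j \\ -1, & i \neq j \end{cases}$$

In order to find extrema of $\Xi$ subject to the constraint defined in equation 14 we use the method
of Lagrange multipliers. The Lagrangian is

$$\nabla \Lambda = \nabla \Xi + \lambda \nabla \left(N - |\mathbf{p}|\right) = \nabla \left[\frac{1}{2} + \frac{1}{2}\lg N + \sum_{j=1}^{M} \frac{p^{(j)}}{2N} \lg\left(\frac{p^{(j)}}{2N}\right)\right] + \lambda \nabla \left(N - |\mathbf{p}|\right)$$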
Taking the Lagrangian to be stationary gives, $\forall\, k \le M$,

$$\begin{aligned}
\frac{\partial}{\partial p^{(k)}} \sum_{j=1}^{M} \frac{p^{(j)}}{2N} \lg \frac{p^{(j)}}{2N} &= \lambda \sum_{j=1}^{M} \frac{\partial p^{(j)}}{\partial p^{(k)}} \\
&= \lambda \left[(M-1)(-1) + 1\right] = (2 - M)\lambda \\
2N(2 - M)\lambda &= \sum_{j=1}^{M} \frac{\partial p^{(j)}}{\partial p^{(k)}} \left( \lg \frac{p^{(j)}}{2N} + \frac{1}{\ln 2} \right) \\
&= \lg \frac{p^{(k)}}{2N} - \sum_{j \neq k}^{M} \lg \frac{p^{(j)}}{2N} + \frac{2 - M}{\ln 2} \\
\lg \frac{p^{(k)}}{2N} - \sum_{j \neq k}^{M} \lg \frac{p^{(j)}}{2N} &= (2 - M)\left(2N\lambda - \frac{1}{\ln 2}\right) = C
\end{aligned} \tag{17}$$

That the quantity on the left-hand side of equation 17 is equal to the constant $C$ for all values
of $k$ implies that $p^{(m)} = p^{(n)}\ \forall\, m, n \le M$ is the extremum of the function. However, we note
that $\nabla^2 \Xi$ is positive for all values of $\mathbf{p}$. Thus, the maximum of the function must occur on the
boundary; symmetry again dictates that the optimal case is to set $p^{(k)} = 1$ for some $k$ and have
$p^{(j)} = 0$ for all others, but we cannot achieve this, as the topology of the Flower machine breaks
when we set any $p = 0$ and we enter a new regime. Therefore, this value of $\Xi$ is a supremum. We
can approach this by setting $p^{(k)} \approx 1$; $\forall\, j \neq k$, $p^{(j)} \approx 0$. The behavior of $\Xi$ in response to changes
in probability is demonstrated in the simple case of a Flower machine having two time-reversed
petals in figure 7 in section 2.3.
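The shape of $\Xi$ described here can be explored numerically. The sketch below evaluates the closed form of equation 13, reading it as $\Xi = \tfrac{1}{2} + \tfrac{1}{2}\lg N + \sum_j \tfrac{p^{(j)}}{2N}\lg\tfrac{p^{(j)}}{2N}$ with the entries of $\mathbf{p}$ normalized to sum to $N$ as in equation 14. The function name and the sample values are assumptions made for illustration.

```python
from math import log2


def xi_uniform_forward(p, n_petals):
    """Causal irreversibility from the closed form of equation 13.

    p[j] is the total incoming probability of reverse petal j; the entries
    are assumed to sum to n_petals (the constraint of equation 14).
    """
    assert abs(sum(p) - n_petals) < 1e-9, "p must sum to N"
    mixing = sum((pj / (2 * n_petals)) * log2(pj / (2 * n_petals)) for pj in p if pj > 0)
    return 0.5 + 0.5 * log2(n_petals) + mixing


N = 4  # forward petals; two reverse petals share the probability as (t, N - t)
for t in (2.0, 3.0, 3.9, 3.99):
    print(t, xi_uniform_forward([t, N - t], N))
```

With two reverse petals the value is smallest when the probability is shared equally and grows as one petal takes nearly all of it, in line with the statement that the interior stationary point is not the maximum. For an equal split over $M$ petals the expression reduces to $\tfrac{1}{2}\lg(N/M)$, which vanishes when $N = M$.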
B.2 General case

We wish to eliminate the assumption that a uniform outgoing distribution maximizes causal
irreversibility. The solution will not be given here, but the form of $\Xi$ is derived. We impose
the convention that $i$ be used to index forward-time states summed over $N$, as in the
probabilities $p^{+}_{(i)}$, $i \in \{1, \dots, N\}$, where $N$ is the number of forward petal states, and $j$ be used
to index reverse-time states summed over $M$, as in the forward-time “incoming” probabilities
$p^{+}_{(ji)}$, $i \in \{1, \dots, N\}$, $j \in \{1, \dots, M\}$, where $M$ is the number of reverse petal states. The
time-reversed probabilities are represented as $p^{-}_{(j)}$ and $p^{-}_{(ji)}$, $i \in \{1, \dots, N\}$, $j \in \{1, \dots, M\}$. If we relax
the requirement that the Flower process possess uniform outgoing distribution probabilities, the
irreversibility takes the following general form given in terms of forward-time probabilities:


s  =  g;  -  c- 


N+l 

j;Pr(5+)lgPr(5+) 


1 


1 


/ 


2  2 
( 


M+l 

J]Pr(S'-)lgPr(5: 
E


V 


V 

/  \  *  ^(b 


i-EIE^  >dE^’ 

'•  ^(b  /  \  i  ^(b 


\ 


J 


(18) 


We  wish  to  maximize  this  quantity  via  the  sets  of  parameters  {p~^y,p^jy]  subject  to  constraints 
originating  from  the  law  of  total  probability. 

Ej>J)  = ' 

i 

E^Sb  =  ^ 


We  note  the  following  differential  relations,  easily  derived  from  the  previous  equations. 


_  J  ^  ^ 


1^-1,  i^k 

The  Lagrangian  for  the  general  case  is 


dpt  ■, 


0,  m  /  j 

1,  m  =  j,  k  =  i 

—  1,  m  =  j,  k  i 


A  =  VE  +  X,v[l-Ypt)  '  +^2V 


Because  the  constraining  functions  are  functions  only  of  one  of  the  sets  of  probabilities,  they 
will  be  treated  independently  in  the  gradient.  The  stationary  points  of  A  for  all  probabilities 
P(mfc)’  ^  Z  Af,  k  <  N  satisfy 
The resulting equation is of the form $C = f_m g_k$. Since this holds (with the same constant) for all
$m$ and $k$, which may be varied independently of one another, then both functions must be constant
in $m$ and $k$. This leads to the conclusions that $p^{+}_{(i)} = \tfrac{1}{N}\ \forall\, i \le N$ and $p^{+}_{(ji)}$ is identical $\forall\, j \le M$.
These are precisely the conditions applied previously to minimize the complexity of the Flower
process.

Again by symmetry, finding the maximum of $\Xi$ requires finding a point that lies on the boundary
of the product space of the simplices, which the method of Lagrange multipliers cannot do. It may
require numerical solution.
C Survey of topological ε-machines

The class of topological ε-machines is composed of ε-machine structures, stripped of transition
probabilities. A method for enumerating all such machines was given in [5] and implemented in
the CMPy package for Python, allowing us to perform an exhaustive survey up to a given number
of causal states $n$. The number of processes increases exponentially with the number of states,
making this survey a difficult task even at relatively low $n$. The survey was performed in parallel
on the UC Davis CSC Hive cluster. Each ε-machine was time-reversed and categorized as reversible,
finitely irreversible, or infinitely irreversible based on whether the mixed-state process converges.
The results, given in figure 17, show that, while processes with low numbers of states are mostly
reversible, with increasing states the proportion of finitely and infinitely irreversible machines
grows. It is impossible to make definite statements given the result, but it could be the case that
the proportion of ε-machines which are reversible dwindles with larger numbers of states, while the
proportion of finitely irreversible ε-machines is high in the mesoscale but either drops at high $n$
or equalizes with the proportion of infinitely irreversible ε-machines, as the upward
trend at the high-$n$ edge of figure 17(b) suggests.


Figure 17: Results from topological ε-machine survey, with number of states given on a semilog
plot (a) and the relative frequency of each category (b)